SHORT-TIME BEHAVIOR OF THE CORRELATION FUNCTIONS FOR THE QUANTUM LANGEVIN EQUATION
We analyze the quantum Langevin equation obtained for the Ford-Kac-Mazur and related models. We study an explicit expression for the correlation function of the noise, obtained by making use of the normal-ordered product of operators. Such an expression is divergence-free, does not require any frequency cutoff, and yields the classical (Markovian) case in the limit of vanishing ℏ. We also bring to light and discuss two different regimes for the momentum autocorrelation. The high-temperature and weak-coupling limits are considered, and the latter is shown to be related to van Hove's "λ²T" limit. © 1996 The American Physical Society
Single step optimal block matched motion estimation with motion vectors having arbitrary pixel precisions
This paper proposes a non-linear block-matching motion model and solves for motion vectors with arbitrary pixel precision in a single step. Because the optimal motion vector minimizing the mean square error is obtained analytically in a single step, the computational complexity of the proposed algorithm is lower than that of conventional quarter-pixel search algorithms. The proposed algorithm can also be regarded as a generalization of conventional half-pixel and quarter-pixel search algorithms, since it can produce motion vectors with arbitrary pixel precision.
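The paper's exact non-linear model is not reproduced in the abstract; as an illustration of the general idea of obtaining a sub-pixel motion vector in closed form rather than by iterative search, the sketch below fits a quadratic through three matching-error samples along one axis and solves for the minimizing offset analytically. The function name and the parabolic model are assumptions for illustration, not the paper's method.

```python
def parabolic_subpel_offset(e_minus, e_0, e_plus):
    """Closed-form sub-pixel offset from three matching errors along one axis.

    Fits e(d) = a*d^2 + b*d + c through the errors at integer displacements
    d = -1, 0, +1 and returns the displacement minimizing the fitted parabola.
    """
    denom = e_minus - 2.0 * e_0 + e_plus  # 2a: curvature of the fit
    if denom <= 0:
        # Degenerate or non-convex fit: keep the integer-pel position.
        return 0.0
    # argmin of the parabola, guaranteed to lie in (-1, +1) when e_0 is smallest
    return 0.5 * (e_minus - e_plus) / denom
```

For errors sampled from e(d) = (d - 0.25)², the closed-form solve recovers the true offset 0.25 exactly, with no half-pel or quarter-pel refinement passes.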
Improved CNN-based Learning of Interpolation Filters for Low-Complexity Inter Prediction in Video Coding
The versatility of recent machine learning approaches makes them ideal for improving next-generation video compression solutions. Unfortunately, these approaches typically bring significant increases in computational complexity and are difficult to convert into explainable models, limiting their potential for implementation within practical video coding applications. This paper introduces a novel explainable neural network-based inter-prediction scheme to improve the interpolation of reference samples needed for fractional precision motion compensation. The approach requires a single neural network to be trained, from which a full quarter-pixel interpolation filter set is derived, as the network is easily interpretable due to its linear structure. A novel training framework enables each network branch to resemble a specific fractional shift. This practical solution makes it very efficient to use alongside conventional video coding schemes. When implemented in the context of the state-of-the-art Versatile Video Coding (VVC) test model, 0.77%, 1.27% and 2.25% BD-rate savings can be achieved on average for lower-resolution sequences under the random access, low-delay B and low-delay P configurations, respectively, while the complexity of the learned interpolation schemes is significantly reduced compared to interpolation with full CNNs.
Comment: IEEE Open Journal of Signal Processing Special Issue on Applied AI and Machine Learning for Video Coding and Streaming, June 202
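The learned filter sets described above replace hand-crafted interpolation filters of the kind standardized in HEVC/VVC. As background for what such a filter does, the sketch below applies the standard 8-tap half-pel luma filter (integer coefficients summing to 64) along one row; the function name and 1-D simplification are assumptions for illustration.

```python
import numpy as np

# 8-tap half-pel luma interpolation filter as standardized in HEVC
# (coefficients sum to 64, so results are renormalized by >> 6).
HALF_PEL = np.array([-1, 4, -11, 40, 40, -11, 4, -1], dtype=np.int64)

def interp_half_pel_1d(samples):
    """Interpolate half-pel positions between integer samples along one row.

    `samples` must include 3 extra integer samples of context on each side
    of the span being interpolated (8 taps -> 7 samples of total overlap).
    """
    taps = len(HALF_PEL)
    out = []
    for i in range(len(samples) - taps + 1):
        acc = int(np.dot(HALF_PEL, samples[i:i + taps]))
        out.append((acc + 32) >> 6)  # round to nearest, normalize by 64
    return out
```

On a constant signal the filter reproduces the input value exactly, since the coefficients sum to 64; a learned filter set derived from a linear network can be dropped into the same per-phase FIR structure.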
A benchmark of visual storytelling in social media
CMUP-ERI/TIC/0046/2014
Media editors in the newsroom are constantly pressed to provide a "like being there" coverage of live events. Social media provides a disorganised collection of images and videos that media professionals need to grasp before publishing their latest news update. Automated news visual storyline editing with social media content can be very challenging, as it not only entails the task of finding the right content but also making sure that news content evolves coherently over time. To tackle these issues, this paper proposes a benchmark for assessing social media visual storylines. The SocialStories benchmark, comprising a total of 40 curated stories covering sports and cultural events, provides the experimental setup and introduces novel quantitative metrics to perform a rigorous evaluation of visual storytelling with social media data.
Interpreting CNN for low complexity learned sub-pixel motion compensation in video coding
Deep learning has shown great potential in image and video compression tasks. However, it brings bit savings at the cost of significant increases in coding complexity, which limits its potential for implementation within practical applications. In this paper, a novel neural network-based tool is presented which improves the interpolation of reference samples needed for fractional precision motion compensation. Contrary to previous efforts, the proposed approach focuses on complexity reduction, achieved by interpreting the interpolation filters learned by the networks. When the approach is implemented in the Versatile Video Coding (VVC) test model, up to 4.5% BD-rate saving for individual sequences is achieved compared with the baseline VVC, while the complexity of the learned interpolation is significantly reduced compared to applying the full neural network.
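One way to see why a purely linear network can be "interpreted" into a cheap filter, as the abstract describes: a cascade of convolutions with no nonlinearities collapses into a single convolution, because convolution is associative. The sketch below demonstrates this in 1-D with two small, arbitrarily chosen kernels; it is a generic illustration of the principle, not the paper's derivation.

```python
import numpy as np

# Two linear "layers" expressed as 1-D convolution kernels (values arbitrary).
k1 = np.array([0.25, 0.5, 0.25])
k2 = np.array([0.5, 0.5])

# Composing the layers is itself a convolution of the kernels:
# applying k1 then k2 equals applying the single combined 4-tap filter.
combined = np.convolve(k1, k2)

x = np.arange(16, dtype=float)            # any input signal
two_pass = np.convolve(np.convolve(x, k1), k2)
one_pass = np.convolve(x, combined)
assert np.allclose(two_pass, one_pass)    # identical outputs, one pass
```

At inference time only `combined` needs to be applied, which is where the complexity reduction relative to running the full network comes from.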
Chroma Intra Prediction with attention-based CNN architectures
Neural networks can be used in video coding to improve chroma intra-prediction. In particular, the use of fully-connected networks has enabled better cross-component prediction with respect to traditional linear models. Nonetheless, state-of-the-art architectures tend to disregard the location of individual reference samples in the prediction process. This paper proposes a new neural network architecture for cross-component intra-prediction. The network uses a novel attention module to model spatial relations between reference and predicted samples. The proposed approach is integrated into the Versatile Video Coding (VVC) prediction pipeline. Experimental results demonstrate compression gains over the latest VVC anchor compared with state-of-the-art chroma intra-prediction methods based on neural networks.
Comment: 27th IEEE International Conference on Image Processing, 25-28 Oct 2020, Abu Dhabi, United Arab Emirates
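The abstract's attention module models spatial relations between reference and predicted samples; the paper's exact architecture is not given here, so the sketch below shows the generic scaled dot-product attention pattern it builds on: each predicted sample position attends over the reference samples and outputs a weighted sum of their values. Shapes and the function name are assumptions for illustration.

```python
import numpy as np

def attention_predict(queries, keys, values):
    """Scaled dot-product attention over reference samples.

    queries: (P, d) features of P predicted sample positions
    keys:    (R, d) features of R reference samples
    values:  (R, c) reference sample values to be mixed
    returns: (P, c) one weighted combination of references per position
    """
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)        # (P, R) affinities
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True) # softmax over references
    return weights @ values
```

Unlike a fully-connected predictor, the weights here depend on per-sample features, so the contribution of each reference sample varies with its relation to the predicted position, which is exactly the locality that the abstract says earlier architectures disregard.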